The TALP-UPC Spanish-English WMT Biomedical Task: Bilingual Embeddings and Char-based Neural Language Model Rescoring in a Phrase-based System

نویسندگان

  • Marta R. Costa-Jussà
  • Cristina España-Bonet
  • Pranava Swaroop Madhyastha
  • Carlos Escolano
  • José A. R. Fonollosa
چکیده

This paper describes the TALP–UPC system in the Spanish–English WMT 2016 biomedical shared task. Our system is a standard phrase-based system enhanced with vocabulary expansion using bilingual word embeddings and a characterbased neural language model with rescoring. The former focuses on resolving outof-vocabulary words, while the latter enhances the fluency of the system. The two modules progressively improve the final translation as measured by a combination of several lexical metrics.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

The TALP-UPC Neural Machine Translation System for German/Finnish-English Using the Inverse Direction Model in Rescoring

In this paper, we describe the TALPUPC participation in the News Task for German-English and Finish-English. Our primary submission implements a fully character to character neural machine translation architecture with an additional rescoring of a n-best list of hypothesis using a forced back-translation to the source sentence. This model gives consistent improvements on different pairs of lang...

متن کامل

The TALP-UPC Phrase-Based Translation System for EACL-WMT 2009

This study presents the TALP-UPC submission to the EACL Fourth Worskhop on Statistical Machine Translation 2009 evaluation campaign. It outlines the architecture and configuration of the 2009 phrase-based statistical machine translation (SMT) system, putting emphasis on the major novelty of this year: combination of SMT systems implementing different word reordering algorithms. Traditionally, w...

متن کامل

The TALP-UPC Ngram-Based Statistical Machine Translation System for ACL-WMT 2008

This paper reports on the participation of the TALP Research Center of the UPC (Universitat Politècnica de Catalunya) to the ACL WMT 2008 evaluation campaign. This year’s system is the evolution of the one we employed for the 2007 campaign. Main updates and extensions involve linguistically motivated word reordering based on the reordering patterns technique. In addition, this system introduces...

متن کامل

The RWTH Aachen University English-Romanian Machine Translation System for WMT 2016

This paper describes the statistical machine translation system developed at RWTH Aachen University for the English→Romanian translation task of the ACL 2016 First Conference on Machine Translation (WMT 2016). We combined three different state-ofthe-art systems in a system combination: A phrase-based system, a hierarchical phrase-based system and an attentionbased neural machine translation sys...

متن کامل

The TALP n-gram-based SMT system for IWSLT 2007

This paper describes TALPtuples, the 2007 N -gram-based statistical machine translation system developed at the TALP Research Center of the UPC (Universitat Politècnica de Catalunya) in Barcelona. Emphasis is put on improvements and extensions of the system of previous years. Mainly, these include optimizing alignment parameters in function of translation metric scores and rescoring with a neur...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016